Algorithms for the Problems of Length-Constrained Heaviest Segments
Authors
Abstract
We present algorithms for the length-constrained maximum sum segment and maximum density segment problems in particular, and for the problem of finding length-constrained heaviest segments in general, for a sequence of real numbers. Given a sequence of n real numbers and two real parameters L and U (L ≤ U), the maximum sum segment problem is to find a consecutive subsequence, called a segment, of length at least L and at most U such that the sum of the numbers in the segment is maximum. The maximum density segment problem is to find a segment of length at least L and at most U such that the density of the numbers in the segment is maximum. For the first problem with non-uniform width there is an algorithm with time and space complexities in O(n). We present an algorithm with time complexity in O(n) and space complexity in O(U). For the second problem with non-uniform width there is a combinatorial solution with time complexity in O(n) and space complexity in O(U). We present a simple geometric algorithm with the same time and space complexities. We extend our algorithms to solve, respectively, the length-constrained k maximum sum segments problem in O(n + k) time and O(max{U, k}) space, and the length-constrained k maximum density segments problem in O(n min{k, U − L}) time and O(U + k) space. We present extensions of our algorithms to find all the length-constrained segments having user-specified sum and density in O(n + m) and O(n log(U − L) + m) time, respectively, where m is the size of the output. Previously, no algorithm with non-trivial results was known for these problems. We indicate how our algorithms extend to higher dimensions. All the algorithms can be extended in a straightforward way to solve the problems with non-uniform width and non-uniform weight.
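To make the maximum sum segment problem concrete, here is a minimal Python sketch for the uniform-width case. It is not the authors' algorithm: it uses the standard prefix-sum plus monotonic-deque technique, runs in O(n) time, and stores the full prefix-sum array, so it uses O(n) space where the abstract claims O(U). The function name `max_sum_segment` and the half-open `(start, end)` convention are assumptions made for illustration.

```python
from collections import deque
from typing import List, Tuple


def max_sum_segment(a: List[float], L: int, U: int) -> Tuple[float, int, int]:
    """Return (best_sum, start, end) so that a[start:end] has maximum sum
    among all segments whose length lies in [L, U].

    Illustrative sketch only (hypothetical helper, not from the paper).
    """
    n = len(a)
    if not (1 <= L <= U) or L > n:
        raise ValueError("need 1 <= L <= U and L <= len(a)")

    # prefix[i] is the sum of a[0:i], so sum(a[s:e]) == prefix[e] - prefix[s].
    prefix = [0] * (n + 1)
    for i, x in enumerate(a):
        prefix[i + 1] = prefix[i] + x

    best = (float("-inf"), 0, 0)
    # Candidate start indices, kept so that their prefix values increase
    # from front to back; the front is always the best admissible start.
    window = deque()
    for end in range(L, n + 1):
        start = end - L  # newest start index that becomes admissible
        while window and prefix[window[-1]] >= prefix[start]:
            window.pop()  # dominated: later start with a smaller prefix
        window.append(start)
        while window[0] < end - U:
            window.popleft()  # segment would be longer than U
        total = prefix[end] - prefix[window[0]]
        if total > best[0]:
            best = (total, window[0], end)
    return best


print(max_sum_segment([2, -5, 3, 4, -1, 2, -6], 2, 3))  # (7, 2, 4)
```

The deque never holds more than U − L + 1 indices, which hints at why a space bound that depends on U rather than n is plausible once the prefix sums are also kept in a bounded buffer.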
The algorithms have potential applications in different areas of biomolecular sequence analysis, including finding CG-rich regions, TA and CG-deficient regions, CpG islands and regions rich in periodical three-base patterns, post-processing sequence alignment, annotating multiple sequence alignments, and computing length-constrained ungapped local alignment. They also have applications in other areas such as pattern recognition, digital image processing and data mining.
Md. Shafiul Alam and Asish Mukhopadhyay. Research supported by an NSERC discovery grant awarded to this author. arXiv:1108.4972v1 [cs.CG] 25 Aug 2011
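As an illustration of the CG-rich region application, the following Python sketch finds the window of a DNA string with the highest fraction of G and C bases, subject to a length between L and U. It is a brute-force reference that checks every admissible window using prefix counts, so it takes O(n(U − L + 1)) time; it is not the linear-time geometric algorithm the abstract describes. The function name `gc_richest_window` and the 0/1 scoring of bases are assumptions made for this example.

```python
from typing import Tuple


def gc_richest_window(seq: str, L: int, U: int) -> Tuple[float, int, int]:
    """Return (density, start, end) so that seq[start:end] has the highest
    G+C fraction among all windows whose length lies in [L, U].

    Brute-force reference sketch (hypothetical helper, not from the paper).
    """
    n = len(seq)
    if not (1 <= L <= U) or L > n:
        raise ValueError("need 1 <= L <= U and L <= len(seq)")

    # prefix[i] counts the G/C bases in seq[0:i].
    prefix = [0] * (n + 1)
    for i, base in enumerate(seq.upper()):
        prefix[i + 1] = prefix[i] + (1 if base in "GC" else 0)

    best = (-1.0, 0, 0)
    for start in range(n - L + 1):
        # Try every admissible end for this start, clipped to the sequence.
        for end in range(start + L, min(start + U, n) + 1):
            density = (prefix[end] - prefix[start]) / (end - start)
            if density > best[0]:
                best = (density, start, end)
    return best


print(gc_richest_window("ATGCGCAT", 3, 4))  # (1.0, 2, 5)
```

Ties are broken in favor of the first window encountered, which is why the example reports the length-3 window `GCG` rather than the equally dense `GCGC`.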
Similar resources
Optimal Algorithms for Finding Density-Constrained Longest and Heaviest Paths in a Tree
Let T be a tree with n nodes, in which each edge is associated with a length and a weight. The density-constrained longest (heaviest) path problem is to find a path of T with maximum path length (weight) whose path density is bounded by an upper bound and a lower bound. The path density is the path weight divided by the path length. We show that both problems can be solved in optimal O(n log n) ...
Efficient Algorithms for Locating the Length-Constrained Heaviest Segments, with Applications to Biomolecular Sequence Analysis
We study two fundamental problems concerning the search for interesting regions in sequences: (i) given a sequence of real numbers of length n and an upper bound U, find a consecutive subsequence of length at most U with the maximum sum and (ii) given a sequence of real numbers of length n and a lower bound L, find a consecutive subsequence of length at least L with the maximum average. We pre...
OPTIMAL CONSTRAINED DESIGN OF STEEL STRUCTURES BY DIFFERENTIAL EVOLUTIONARY ALGORITHMS
Structural optimization, when approached by conventional (gradient-based) minimization algorithms, presents several difficulties, mainly related to the computational cost of the huge number of nonlinear analyses required for both Objective Functions (OFs) and Constraints. Moreover, from the early '80s to today, Evolutionary Algorithms have been successfully developed and applied as a ...
Definitions and Algorithms in SEGID
Given a (multiple) sequence alignment, SEGID first converts it into a sequence of numbers, where each number is the alignment score of a column. (SEGID also directly accepts a sequence of numbers as input.) Then it provides three algorithms to identify conserved segments (high score substrings): 1. Longest segment (with average value lower bound): given a string of numbers and a number A, find ...
A multi-objective resource-constrained optimization of time-cost trade-off problems in scheduling project
This paper presents a multi-objective resource-constrained project scheduling problem with positive and negative cash flows. Net present value (NPV) maximization and makespan minimization are the objectives of this study. Since this problem is considered a complex optimization problem in the NP-hard context, we present a mathematical model for the given problem and solve three evolutionary algorithms;...
Journal title:
- CoRR
Volume abs/1108.4972 Issue
Pages -
Publication date 2011